Measures of Quality of Rulesets Extracted from Data

Author

  • Martin Holeňa
Abstract

The paper deals with quality measures of whole sets of rules extracted from data, as a counterpart to the more commonly used measures of individual rules. This research has been motivated by the increasingly frequent extraction of non-classification rules, such as association rules and rules of observational logic, in real-world data mining tasks. The paper sketches a typology of rule extraction methods and of their rulesets, and recalls that quality measures for whole sets of rules have so far been used only for classification rulesets. It then proposes three possible ways of extending such measures to general rulesets. The paper also recalls the possibility of measuring the dependence of a classification ruleset on the parameters of the classification method by means of ROC curves, and proposes a generalization of ROC curves to general rulesets. Finally, a brief illustration on rulesets extracted by means of the GUHA method is given.


Similar resources

Fuzzification and Reduction of Information-Theoretic Rule Sets

If-then rules are one of the most common forms of knowledge discovered by data mining methods. The number and the length of extracted rules tend to increase with the size of a database, making the rulesets less interpretable and useful. Existing methods of extracting fuzzy rules from numerical data improve the interpretability aspect, but the dimensionality of fuzzy rulesets remains high. In th...


Games from Basic Data Structures

In this paper, we consider combinatorial game rulesets based on data structures normally covered in an undergraduate Computer Science Data Structures course: arrays, stacks, queues, priority queues, sets, linked lists, and binary trees. We describe many rulesets as well as computational and mathematical properties about them. Two of the rulesets, Tower Nim and Myopic Col, are new. We show polyn...


Transferability of OBIA Rulesets for IDP Camp Analysis in Darfur

The analysis of refugee and IDP (internally displaced persons) camps from VHSR (very high spatial resolution) satellite imagery can assist humanitarian relief organisations by providing population estimations and camp structure analysis based on automated dwelling extraction. Since smooth transferability of rulesets in a fine scale and high complexity environment is limited, we present an appro...


Intrusion detection using fuzzy association rules

Vulnerabilities in common security components such as firewalls are inevitable. Intrusion Detection Systems (IDS) are used as another wall to protect computer systems and to identify corresponding vulnerabilities. In this paper a novel framework based on data mining techniques is proposed for designing an IDS. In this framework, the classification engine, which is actually the core of the IDS, ...


Using Text Reviews for Product Entity Completion

In this paper we address the problem of obtaining structured information about products in the form of attribute-value pairs by leveraging a combination of enterprise internal product descriptions and external data. Product descriptions are short text strings used internally within enterprises to describe a product. These strings usually comprise the brand name, name of the product, and its ...




Publication date: 2008